Compression of flow can reveal overlapping modular organization in networks 
To better understand the overlapping modular organization of large networks with respect to flow, 
here we introduce the map equation for overlapping modules. In this information-theoretic frame- 
work, we use the correspondence between compression and regularity detection. The generalized 
map equation measures how well we can compress a description of flow in the network when we par- 
tition it into modules with possible overlaps. When we minimize the generalized map equation over 
overlapping network partitions, we detect modules that capture flow and determine which nodes 
at the boundaries between modules should be classified in multiple modules and to what degree. 
With a novel greedy search algorithm, we find that some networks, for example, the neural network 
of C. Elegans, are best described by modules dominated by hard boundaries, but that others, for 
example, the sparse European road network, have a highly overlapping modular organization. 



I. INTRODUCTION 

To discern higher levels of organization in large social
and biological networks [1-5], researchers have used hard
clustering algorithms to aggregate highly interconnected
nodes into non-overlapping modules [6-8], assuming that
each node plays only a single modular role in a network.
More recently, having realized that nodes can play many
roles in a network, researchers have detected overlapping
modules using three approaches: a hard clustering
algorithm that is run multiple times [9, 10]; a local
clustering method that generates independent and
intersecting modules [11-14]; and link clustering that
assigns boundary nodes to multiple modules [15-17].
However, all these approaches have limitations. The first
and second approaches require several steps or tunable
parameters to infer overlapping modules, and the third
approach necessarily overlaps all neighboring modules.
To find simultaneously the number of modules in a
network, which nodes belong to which modules, and which
nodes should belong to multiple modules and to what
degree, we use an information-theoretic approach and the
map equation [18].

We are interested in the dynamics on networks and 
what role nodes on the boundaries between modules play 
with respect to flow through the system. For example, 
in Fig. 1(a), Keflavik airport in Reykjavik connects Europe
and North America in the global air traffic network.
When we summarize the network in modules with long 
flow persistence times, should Reykjavik belong to Eu- 
rope, North America, or both? In our framework, the 
answer depends on the traffic flow. That is, Reykjavik's 
role in the network depends on to what degree passen- 
gers visit Iceland as tourists versus to what degree they 
use Keflavik as a transit between North America and 
Europe. If we assign the boundary node to both mod- 
ules, for returning flow we can increase the time the flow 
stays in the modules and decrease the transition rate be- 
tween the modules, but for transit flow, the transition 
rate does not decrease and a single module assignment
is preferable. By generalizing the information-theoretic
clustering method called the map equation [18] to
overlapping structures, we can formalize this observation and
use the level of compression of a modular description of 
the flow through the system to resolve the fuzzy bound- 
aries between modules. With this approach, modules will 
overlap if they correspond to separate flow systems with 
shared nodes. 

In the next section, we review the map equation frame- 
work, introduce the map equation for overlapping mod- 
ules, and explain how it exploits returning flow near mod- 
ule boundaries. The mathematical framework works for 
both generalized and empirical flow, but here we illus- 
trate the method by exploring the overlapping modu- 
lar structure of several real-world networks based on the 
probability flow of a random walker. We also test the 
performance on synthetic networks and compare the re- 
sults with other clustering algorithms. Finally, in the 
Materials and Methods section, we provide complete de- 
scriptions of the map equation for overlapping modules 
and the novel search algorithm. 



II. RESULTS AND DISCUSSION 

A. The map equation 

The mathematics of the map equation are designed to 
take advantage of regularities in the flow that connects 
a system's components and generates their interdepen- 
dence. The flow can be, for example, passengers travel- 
ing between airports, money transferred between banks, 
gossip exchanged among friends, people surfing the web, 
or, what we use here as a proxy for real flow, a random 
walker on a network guided by the (weighted directed) 
links of the network. Specifically, the map equation
measures how well different partitions of a network can be
used to compress descriptions of flow on the network and
utilizes the rationale of the minimum description length
principle. Quoting Peter Grünwald [20]: "... [E]very
regularity in the data can be used to compress the data, i.e., to
describe it using fewer symbols than the number of
symbols needed to describe the data literally." That is, the
map equation gauges how successful different network 
partitions are at finding regularities in the flow on the 
network. 

We employ two regularities for compressing flow on a 
network. First, we use short code words for nodes visited 
often and, by necessity, long code words for nodes visited 
rarely, such that the average code word length will be 
as short as possible. Second, we use a two-level code 
for module movements and within-module movements, 
such that we can reuse short node code words between 
modules with long persistence times. 
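
As a minimal illustration of the first regularity, the following Python sketch builds a Huffman code for a hypothetical set of node-visit frequencies and compares the average code word length with the Shannon entropy. The function name `huffman_code_lengths` and the visit rates are assumptions made for this example only; the map equation itself never constructs actual code words.

```python
import heapq
import math


def huffman_code_lengths(probs):
    """Return the Huffman code word length (in bits) for each symbol."""
    # Heap items: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, symbols1 = heapq.heappop(heap)
        p2, _, symbols2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol below them.
        for s in symbols1 + symbols2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, symbols1 + symbols2))
        counter += 1
    return lengths


# Hypothetical node-visit frequencies: one frequent node, three rarer ones.
visit_rates = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_code_lengths(visit_rates)
average_length = sum(p * l for p, l in zip(visit_rates, lengths))
entropy = -sum(p * math.log2(p) for p in visit_rates)
print(lengths, average_length, entropy)
```

In this example the frequently visited node receives the shortest code word, and because all frequencies are powers of one half, the average code word length coincides with the entropy bound.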

Because we are not interested in the actual code words, 
but only in the theoretical limit of compression, we use 
Shannon's source coding theorem [21], which establishes
the Shannon entropy $H(p)$ as the lower limit of the
average number of bits per code word necessary to encode a
message, given the probability distribution $p$ of the code
words:

$$H(p) = -\sum_i p_i \log_2 p_i.$$

For example, if there is a message "ABABBAAB..." 
for which the symbols "A" and "B" occur randomly with 
the same frequency, that is, "A" and "B" are independent 
and identically distributed, the source coding theorem 
states that no binary language can describe the message 
with fewer than $-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1$ bit per symbol.
However, if "A" occurs twice as often as "B", the
regularity can be exploited and the message compressed to
$-\frac{2}{3}\log_2\frac{2}{3} - \frac{1}{3}\log_2\frac{1}{3} \approx 0.92$ bits per symbol. To measure
the per-step minimum average description length of flow
on a network, we collect the mapping from symbols "A"
and "B" or, in our case, node names, to code words in a
codebook, and calculate the Shannon entropy based on
the node-visit frequencies.
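
The two numbers in this example can be checked with a few lines of Python. This is only an arithmetic check; the helper name `shannon_entropy` is an assumption for this sketch.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


uniform = shannon_entropy([1 / 2, 1 / 2])  # "A" and "B" equally frequent
skewed = shannon_entropy([2 / 3, 1 / 3])   # "A" twice as frequent as "B"
print(uniform, round(skewed, 2))
```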

But flow or a random walker do not visit nodes in- 
dependently. For example, if a network has a modular 
structure, once a random walker enters a tightly inter- 
connected region in the network, in the next step she 
will most likely visit a node in the same tightly inter- 
connected region, and she tends to stay in the region 
for a long time. To take advantage of this regularity 
and further compress the description of the walk, we use 
multiple module codebooks, each with an extra exit code 
that is used when the random walker exits the module, 
and an index codebook that is used after the exit code to 
specify which module codebook is to be used next. Now 
we can make use of higher-order structure in a network. 
For a modular network, we can describe flow on the net- 
work without ambiguities in fewer bits, using a two-level 
code, than we could do with only one codebook, because 
we only use the index codebook for movements between 
modules and can reuse short code words in the smaller 
module codebooks. 

FIG. 1. The map equation for overlapping modules can
exploit regularities in the boundary flow between modules. The
three lines in (b) show the description length as a function
of the proportion of returning passengers for three different
partitions: North America, Europe, and Reykjavik in one
big module (green); North America and Europe in two
non-overlapping modules with Reykjavik in either of the modules
(black); and North America and Europe in two overlapping
modules with Reykjavik in both modules (blue).

Given a network partition M, it is now straightforward
to calculate the per-step minimum description length
L(M) of flow on the network. We use the Shannon
entropy to calculate the average description length of each
codebook and weight the average lengths by their rates
of use. For a modular partition M with m modules, the
map equation takes the form:

$$L(M) = q_{\curvearrowleft} H(\mathcal{Q}) + \sum_{i=1}^{m} p^{i}_{\circlearrowright} H(\mathcal{P}^{i}). \tag{1}$$

For between-module movements, we use $q_{\curvearrowleft}$ for the rate
of use of the index codebook with module code words
used according to the probability distribution $\mathcal{Q}$. For
within-module movements, we use $p^{i}_{\circlearrowright}$ for the rate of use
of the $i$-th codebook with node and exit code words used
according to the probability distribution $\mathcal{P}^{i}$.
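
To make Eq. 1 concrete, the sketch below evaluates the two-level description length for a hard partition of a small undirected, unweighted network, where the visit rate of a node is its degree divided by twice the number of links and the rate of entering a module equals the rate of exiting it. The function names and the example network (two triangles joined by one link) are assumptions for illustration, not part of the original analysis.

```python
import math
from collections import defaultdict


def entropy(weights):
    """Shannon entropy in bits of the normalized, non-negative weights."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((x / total) * math.log2(x / total) for x in weights if x > 0)


def map_equation(edges, module_of):
    """Per-step description length for an undirected, unweighted network
    and a hard partition given as a dict node -> module."""
    degree = defaultdict(int)
    cut = defaultdict(int)  # link ends that leave each module
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
        if module_of[a] != module_of[b]:
            cut[module_of[a]] += 1
            cut[module_of[b]] += 1
    two_m = 2 * len(edges)
    modules = set(module_of.values())
    exit_rate = {i: cut[i] / two_m for i in modules}
    # Index codebook: rate of use times entropy of the module code words.
    length = sum(exit_rate.values()) * entropy(list(exit_rate.values()))
    # Module codebooks: exit code word plus node code words.
    for i in modules:
        visits = [degree[a] / two_m for a in degree if module_of[a] == i]
        length += (exit_rate[i] + sum(visits)) * entropy([exit_rate[i]] + visits)
    return length


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the link 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
one_module = map_equation(edges, {a: "all" for a in range(6)})
two_modules = map_equation(edges, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
print(one_module, two_modules)
```

For this toy network the two-module partition gives the shorter description, as expected when the walker tends to stay within each triangle.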

By minimizing the map equation over network par- 
titions, we can resolve how many modules we should 
use and which nodes should be in which modules to
best capture the dynamics on the network. See
http://www.mapequation.org for a dynamic visualization of
the mechanics of the map equation. Because the map 
equation only depends on the rates of node visits and 
module transitions, it is universal to all flow for which 
the rates of node visits and module transitions can be 
measured or calculated. The code structure of the map 
equation can also be generalized to make use of higher-order
structures. In Ref. [22], we show how a multilevel
code structure can reveal hierarchical organization in net- 
works, and in the next section, we show that we can 
capitalize on overlapping structures by releasing the con- 
straint that a node can only belong to one module code- 
book. 



B. The map equation for overlapping modules 

The code structure of the map equation framework is 
flexible and can be modified to uncover different struc- 
tures of a network as long as flow on the network can 
be unambiguously coded and decoded. As we will show 
here, by releasing the constraint that a node can only be- 
long to one module codebook and allowing nodes to be 
information free ports, we can reveal overlapping mod- 
ular organization in networks. To see how, let us again 
study the air traffic between North America and Europe 
in Fig. 1(a). Suppose that cities in North America and
Europe belong to two different modules, for simplicity 
identical in size and composition, and we are to assign 
membership to Reykjavik between North America and 
Europe. For a hard partition, we would assign Reykjavik 
to the module that most passengers travel to and from, 
and if the traffic flow were the same, we could choose either
module. But if the flow to and from Reykjavik were dom- 
inated by American and European tourists visiting Ice- 
land for sightseeing before returning to their home con- 
tinent, both Americans and Europeans would consider 
Iceland as part of their territory. We can accommodate
this view if we allow nodes to belong to multiple module
codebooks; depending on the origin of the flow, we
use different code words for the same node.

With the map equation for overlapping modules, we 
can measure the description length of flow on the net- 
work with nodes assigned to multiple modules. By min- 
imizing the map equation for overlapping modules, we 
can not only resolve into how many modules a network 
is organized and which nodes belong to which modules, 
but also which nodes belong to multiple modules and to 
what degree. 

The pattern of flow, returning tourists to Iceland or in- 
transit businessmen on intercontinental trips, determines 
whether we should assign Reykjavik to North America, 
Europe, or both. Or, conversely, when we decide whether 
Reykjavik should be assigned to North America, Europe, 
or both, we reveal the pattern of boundary flow between 
modules, as Fig. 1 illustrates. In this hypothetical
example, assigning cities to two non-overlapping modules
is always better than assigning all cities to one module.
But for a sufficiently high proportion of returning flow,
the overlapping modular solution with Reykjavik in both 
modules as a free port provides the most efficient parti- 
tion to describe flow on the network. 

The map equation for overlapping modules can take 
advantage of regularities in the boundary flow between 
modules. To measure the length of an overlapping mod- 
ular description of flow on a network, we must decide 
how the flow switches modules to calculate the node- 
visit rates from different modules of multiply assigned 
nodes. In the Materials and Methods section, we provide 
a detailed description of how a random walker moves in 
an overlapping modular structure, but the rule is sim- 
ple: when a random walker arrives at a node assigned to 
multiple modules, the walker remains in the same mod- 
ule if possible. Otherwise, the random walker switches, 
with equal probability, to one of the modules to which 
the node is assigned. 
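
The switching rule stated above can be written as a small function. This is a sketch under our reading of the rule; the function name, the module labels, and the use of Python's `random` module are assumptions for illustration.

```python
import random


def next_module(current_module, target_modules, rng=random):
    """Module the walker is in after stepping to a node that is
    assigned to the modules in `target_modules`."""
    if current_module in target_modules:
        # The walker remains in the same module if possible.
        return current_module
    # Otherwise it switches, with equal probability, to one of the
    # modules to which the target node is assigned.
    return rng.choice(sorted(target_modules))


reykjavik = {"Europe", "North America"}
stays = next_module("Europe", reykjavik)   # arrives from a European city
switches = next_module("Asia", reykjavik)  # arrives from a third module
print(stays, switches)
```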

Figure 2 illustrates the code structure of a hard and a
fuzzy partition of an example network with the dynam- 
ics derived from a random walker. For this network, the 
figure shows that an overlapping modular description al- 
lows us to describe the path of a random walker with 
fewer bits than we could do with a hard network par- 
tition. With overlapping modules, we halve the use of 
the index codebook, since the rate of module switching 
halves. Because we consequently use the exit codes in the 
now identical module codebooks less often, the descrip- 
tion of movements within modules also becomes shorter, 
even if the average code word length increases. Turning 
the reasoning around again, given the overlapping mod- 
ular organization, we have learned that returning flow 
characterizes the boundary flow between the modules. 

With the mathematical foundation in place, we need 
an algorithm that can discover the best partition of the 
network. In particular, which nodes should belong to 
multiple modules and to what degree? For this opti- 
mization problem, we have developed a greedy search 
algorithm that we call Fuzzy infomap and detail in the 
Materials and Methods section. Here we give a short 
summary of Fuzzy infomap designed to provide good ap- 
proximate solutions for large networks. We start from 
Infomap's hard clustering of the network and then exe- 
cute the two-step algorithm. In the first step, we mea- 
sure the change in the description length when we assign 
boundary nodes, one by one, to multiple modules. This 
calculation is fast, but aggregating the changes in the 
second step is expensive and often requires recalculating 
all node-visit rates. Therefore, we rank the individual 
multiple module assignments and, in a greedy fashion, 
aggregate the individual best ones to minimize the de- 
scription length. 
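
The two-step idea can be sketched as follows. This is a simplified illustration and not the Fuzzy infomap implementation: `description_length` is a hypothetical callable that returns the description length for a set of extra (node, module) assignments, and the toy cost function below is invented purely to exercise the code.

```python
def greedy_overlap_search(candidates, description_length):
    """Rank extra (node, module) assignments individually, then
    aggregate them best first while the description length decreases."""
    baseline = description_length(frozenset())
    # Step 1: measure the change for each extra assignment on its own.
    gains = []
    for pair in candidates:
        delta = description_length(frozenset([pair])) - baseline
        if delta < 0:
            gains.append((delta, pair))
    gains.sort(key=lambda item: item[0])
    # Step 2: aggregate greedily and keep an assignment only if the
    # combined description length still improves.
    accepted, best = frozenset(), baseline
    for _, pair in gains:
        trial = accepted | {pair}
        length = description_length(trial)
        if length < best:
            accepted, best = trial, length
    return accepted, best


# Toy cost function (hypothetical): individual changes plus an
# interaction penalty when two particular assignments are combined.
toy_change = {("a", 1): -0.5, ("b", 2): -0.3, ("c", 1): 0.2}


def toy_length(accepted):
    total = 10.0 + sum(toy_change[pair] for pair in accepted)
    if ("a", 1) in accepted and ("b", 2) in accepted:
        total += 0.4
    return total


accepted, best = greedy_overlap_search(toy_change, toy_length)
print(sorted(accepted), best)
```

The toy example shows why aggregation needs re-evaluation: two assignments that each shorten the description on their own need not do so together.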

FIG. 2. The code structure of the map equation (a) without
and (b) with overlapping modules. The color of a node
in the networks and of the corresponding block in the code
structures represents the module assignment, the width of a
block represents the node-visit rate, and the height of the
blocks represents the average codelength of code words in the
codebooks.



C. Overlapping modular organization in real-world networks

To illustrate our flow-based approach, we have clustered
a number of real-world networks. Figure 3 shows
researchers organized in overlapping research groups in
network science. The underlying co-authorship network
is derived from the reference lists in the three review
articles dl ini US]. In this weighted undirected network,
we connect two researchers with a weighted link if they
have co-authored one or more research papers. For every
co-authored paper, we add to the total weight of
the link a weight inversely proportional to the number
of authors on the paper. Our premise is that two persons
who have co-authored a paper have exchanged information,
information they can subsequently share with
other researchers and thereby induce a flow of information
on the network. The map equation can capitalize on
regularities in this flow, and Fig. 3 highlights one area of the
co-authorship network with several overlapping research
groups. For example, assigning Jure Leskovec to four
research groups contributes to maximal compression of a
description of a random walker on the network. Based
on this co-authorship network, Leskovec is strongly
associated with Dasgupta, Mahoney, Lang, and Backstrom,
but also with groups at Cornell University, Carnegie Mellon
University, Stanford University, and Yahoo Research.
The size of the modules and the fraction of returning flow
at the boundary nodes determine whether hard or fuzzy
boundaries between research groups lead to optimal
compression of flow on the network.

Table I shows the level of compression and overlap of a
number of real-world networks. The networks are sorted
from highest to lowest compression gain when allowing
for overlaps. We find the highest compression gain in
the European roads network, which is a sparse network
with intersections as nodes and roads as links. Many
intersections at boundaries between modules are classified
in multiple modules, because intersections only connect
a few roads and the return rate of the random flow is
relatively high.

By contrast, compressing random flow in overlapping
modules only gives a marginal gain over hard clustering
in the highly interconnected and directed network of
C. Elegans, where fewer than three percent of the neurons
are classified in multiple modules. Even if there is
evidence that the neural network is modular, we most likely
underestimate the degree of overlap with a random walk
model of flow.

In the middle of the table, the world air routes network
shows a relatively low compression gain, given the many
cities classified in multiple modules. For this network,
the compression gain would be much higher if, instead
of random flow on the links, we were to describe real
passenger flow with a higher return rate.


D. Comparing the map equation for overlapping 
modules with other methods 



Depending on the system being studied and the re- 
search question at hand, researchers develop clustering 
algorithms for overlapping modules based on different 
principles. For example, while some researchers take a 
statistical approach and see modules as non-random
features of a network, other researchers use a local definition
and identify independent and intersecting modules, or
take a link perspective and assign all boundary nodes to
multiple modules. Consequently, the final partitions are
quite different, and it is interesting to contrast our
information-theoretic and flow-based approach, implemented
in Fuzzy infomap, with these approaches, here represented
by OSLOM [13], Clique Percolation [31], and Link
Communities [16].

OSLOM defines a module as the set of nodes that max- 
imizes a local statistical significance metric. In other 
words, OSLOM identifies possibly overlapping modules 
that are unlikely to be found in a random network. 
Clique percolation identifies clusters by sliding fully con- 
nected k-cliques to adjacent k-cliques that share k-1 ver- 
tices with each other. A module is defined as the maxi- 
mal set of nodes that can be visited in chained iterations 
of this operation, and the overlaps consist of the shared 
nodes between modules that do not support the slide 
operation across the boundary. Finally, the Link Com- 
munities approach creates highly overlapping modules by 
aggregating nodes that are part of a link community. The 
link communities themselves are built using a similarity 
measure between links, the primal actors of the method. 

To compare the methods at different degrees of overlap,
we used a set of synthetic networks presented in
Ref. [32]. In Table II, we included six statistics for the
four methods applied to synthetic networks with 1000
nodes and three different degrees of overlap (see table
caption for details). The first group of partition numbers
describes the number of detected modules, the number of
nodes that are assigned to multiple modules, and the
total number of assignments. To interpret the results from
a flow perspective, we included the index, module, and
total codelength for describing a random walker on the
network given the network partition.

FIG. 3. Network scientists organized in overlapping research groups. The colors of the nodes represent overlapping research
groups identified by the map equation and the pie charts represent the fractional association with the different research groups.

TABLE I. The overlapping organization and the level of compression of eight real-world networks. For each network with n
nodes and l links, we report the hard partition compression C with Infomap, the additional compression with Fuzzy infomap,
and the fraction of nodes that are assigned to multiple modules.

Network                      n      l       C (Infomap)   Additional (Fuzzy infomap)   Nodes in multiple modules
European roads network       1018   1274    46.2%         10.4%                        35.5%
Western states power grid    4941   6994    53.4%         8.84%                        27.5%
Human diseases network       516    1188    46.4%         2.87%                        15.3%
Coauthorship network         552    1317    48.9%         2.47%                        14.6%
World air routes             3618   14142   31.1%         1.24%                        13.9%
U.S. political blogs         1222   16714   4.13%         0.35%                        5.81%
Swedish political blogs      855    10315   0.50%         0.18%                        4.79%
Neural net. of C. Elegans    297    2345    1.16%         0.13%                        2.69%

TABLE II. Comparing four different overlapping clustering
methods. We run Fuzzy infomap, OSLOM, and Link Communities
with their default settings and use clique size four for
the Clique Percolation method. All values are averaged over
ten instantiations of random undirected and unweighted
networks with 1000 nodes and predefined community structure,
generated with three different degrees of overlap [32]: Low
overlap corresponds to 100, medium overlap corresponds to
300, and high overlap corresponds to 500 nodes in multiple
modules. All other parameters were held constant: The number
of modules that multiply assigned nodes are assigned to was
set to two; each cluster consisted of on average 20 nodes with
a minimum of 10 and a maximum of 50 nodes; and the power
law exponent was set to -2 for the node degree distribution
and -1 for the module size distribution. Finally, the mixing
parameter that controls the proportion of links within and
between modules was set to 0.1.

                      Partition numbers                    Codelength (bits)
                      modules   overlaps   assignments     index   module   total
Low overlap
  Fuzzy Infomap       44        105        1228            1.7     5.9      7.6
  OSLOM               44        89         1089            1.8     5.8      7.6
  Clique Percolation  43        104        1108            1.7     6.0      7.7
  Link Communities    3415      1000       9215            8.1     3.5      12
Medium overlap
  Fuzzy Infomap       53        303        1830            2.2     6.0      8.2
  OSLOM               54        276        1277            2.3     5.9      8.2
  Clique Percolation  55        268        1283            2.3     6.1      8.3
  Link Communities    4457      1000       11628           8.7     3.5      14
High overlap
  Fuzzy Infomap       56        398        1676            2.6     6.1      8.8
  OSLOM               61        462        1465            2.8     6.0      8.8
  Clique Percolation  73        388        1429            2.9     6.1      9.0
  Link Communities    4298      1000       11063           10      3.7      11

Table II shows that Fuzzy infomap and OSLOM generate
similar partitions for low and medium degrees of overlap,
but the trend when going to higher degrees of overlap
indicates fundamental differences. By assigning boundary
nodes to more modules than OSLOM prefers, Fuzzy
infomap identifies modules with longer persistence times.
The shorter index codelength resulting from the fewer 
transitions compensates for the longer module codelength 
from the larger modules. As a result, with the over- 
lapping partitions generated by Fuzzy infomap, random 
flow can be described with fewer bits. But the difference 
is small and shows up only in the second decimal place 
when up to half of all the nodes are assigned to multiple 
modules. 

Clique percolation generates partitions with more 
modules but fewer assignments than both Fuzzy infomap 
and OSLOM. From a flow perspective, smaller modules 
with less overlap give more module switches that cannot 
be compensated for by a shorter module codelength. The 
strength of the Clique percolation method is the simple 
definition that allows for easy interpretation of the re- 
sults. 

Designed with links as the primal actors used to iden- 
tify pervasive overlap in networks, the results of Link 
Communities are quite different. For example, indepen- 
dent of the degree of overlap of the synthetic networks, 
each node belongs to on average ten modules. From the 
perspective of a random flow model, the persistence time 
is short in the many small modules, and the information 
necessary to encode the many transitions is much larger 
than for the other methods. This result is expected, as 
Link Communities is tailored to identify pervasive over- 
lap in social networks in which people belong to several 
modules and information flow is far from random. 

Often the performance is an important aspect to con- 
sider when choosing a clustering method. Therefore, we 
measured the time it took to cluster the synthetic net- 
works with the different clustering algorithms. We stress 
that we used presumably non-optimized research code 
made available online by its developers and that the
performance, of course, depends on the network. Per 1000-node
synthetic network used in our comparison, Fuzzy
infomap used on average 1.7 seconds for a single iteration
of module growth and 240 seconds for multiple
growths, OSLOM used 330 seconds, the Clique Percolation
method 1.5 seconds, and Link Communities were
identified in 2.4 seconds.

We conclude this comparison by stressing that the re- 
search question at hand must be considered when choos- 
ing a clustering method. Fuzzy infomap provides fast 
results that, for a random flow model, are similar to re- 
sults generated by OSLOM and the Clique percolation 
method, at least for moderate degree of overlap. On the 
other hand, for identifying pervasive overlap, researchers 
should consider Link Communities or a generalized flow 
model with longer persistence times in smaller, highly 
overlapping modules. 







FIG. 4. Movements between nodes possibly assigned to
multiple modules. (a) Assuming that the random walker
is in module $i$, she remains in module $i$ when moving to node
$\beta$ if node $\beta$ is assigned to module $i$. (b) But if node $\beta$ is not
assigned to module $i$, she switches with equal probability to
any of the modules node $\beta$ is assigned to.



III. MATERIALS AND METHODS 

Here we detail the map equation for overlapping mod- 
ules and describe our greedy search algorithm. 

A. The map equation for overlapping modules 

Below we explain in detail how we derive the transition 
rates of a random walker between overlapping modules. 
We also derive the conditional probabilities for nodes as- 
signed to multiple modules. We then express the map 
equation (Eq. 1) in terms of these rates, which allows for
fast updates in the search algorithm. 

1. Movements between nodes assigned to multiple modules 

To calculate the map equation for overlapping modules,
we need the visit rates $p_{\alpha_i}$ for all modules $i \in M_\alpha$
that a node $\alpha$ is assigned to, and the inflow $q_{i\curvearrowleft}$ and the
outflow $q_{i\curvearrowright}$ of all modules. We derive these quantities from
the weighted and directed links $W_{\alpha\beta}$, which we normalize
such that $w_{\alpha\beta}$ corresponds to the probability of the
random walker moving to node $\beta$ once at node $\alpha$:

$$w_{\alpha\beta} = \begin{cases} 0 & \text{if there is no link from } \alpha \text{ to } \beta \\ \dfrac{W_{\alpha\beta}}{\sum_{\gamma} W_{\alpha\gamma}} & \text{otherwise.} \end{cases} \tag{2}$$

When necessary, we use random teleportation to
guarantee a unique steady state distribution [33]. That is,
for directed networks, at rate $\tau$, or whenever the random
walker arrives at a node with no out-links, the random
walker teleports to a random node in the network. To
simplify the notation, we set $w_{\alpha\beta} = 1/n$ for all nodes $\alpha$
without out-links to all $n$ nodes $\beta$ in the network.

The movements between multiply assigned nodes and
overlapping modules are straightforward. Whenever the
random walker arrives at a node that is assigned to
multiple modules, she remains in the same module if possible
or switches to a random module if not possible. For
example, assuming that the random walker is in module
$i$, she remains in module $i$ when moving to node $\beta$ if
node $\beta$ is assigned to module $i$, $i \in M_\beta$. But if node $\beta$
is not assigned to module $i$, $i \notin M_\beta$, she switches with
equal probability $1/|M_\beta|$ to any of the modules to which
node $\beta$ is assigned (see Fig. 4). If we define the transition
function

$$\delta_{\alpha\beta}(i,j) = \begin{cases} 1 & \text{if } i = j \\ \dfrac{1}{|M_\beta|} & \text{if } i \neq j \text{ and } i \notin M_\beta \\ 0 & \text{if } i \neq j \text{ and } i \in M_\beta, \end{cases} \tag{3}$$

we can now define the visit rates by the equation system

$$p_{\alpha_i} = \sum_{\beta} \sum_{j \in M_\beta} p_{\beta_j}\, \delta_{\beta\alpha}(j,i) \left[ (1-\tau)\, w_{\beta\alpha} + \tau \frac{1}{n} \right]. \tag{4}$$

We solve for the unknown visit rates with the fast
iterative algorithm BiCGStab [33]. Since every node in
module $i$ guides a fraction $(1-\tau) \sum_{\beta \notin i} w_{\alpha\beta}$ and teleports a
fraction $\tau (n - n_i)/n$ of its conditional probability $p_{\alpha_i}$ to nodes
outside of module $i$, the exit probability of module $i$ is

$$q_{i\curvearrowright} = \sum_{\alpha \in i} p_{\alpha_i} \left[ (1-\tau) \sum_{\beta \notin i} w_{\alpha\beta} + \tau \frac{n - n_i}{n} \right], \tag{5}$$

where $n_i$ is the number of nodes assigned to module $i$.
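
The following Python sketch illustrates how visit rates and exit probabilities of this kind can be computed for a small example. It uses plain power iteration instead of BiCGStab, assumes that the normalized transition probabilities are supplied as a dict of dicts with every node having out-links, and sets the teleportation rate to zero because the example network is undirected. The function names and the example (two triangles sharing node 2) are assumptions for illustration.

```python
def overlapping_visit_rates(w, modules_of, tau=0.15, iterations=500):
    """Power iteration for the visit rate of every (node, module) state.
    w[a][b] is the normalized probability of stepping from a to b and
    modules_of[a] is the set of modules node a is assigned to."""
    nodes = sorted(modules_of)
    n = len(nodes)
    states = [(a, i) for a in nodes for i in sorted(modules_of[a])]
    p = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        new_p = dict.fromkeys(states, 0.0)
        for (b, j), mass in p.items():
            for a in nodes:
                step = mass * ((1.0 - tau) * w[b].get(a, 0.0) + tau / n)
                if step == 0.0:
                    continue
                if j in modules_of[a]:
                    # Remain in the same module if possible.
                    new_p[(a, j)] += step
                else:
                    # Otherwise switch uniformly to a module of node a.
                    for i in modules_of[a]:
                        new_p[(a, i)] += step / len(modules_of[a])
        p = new_p
    return p


def exit_rates(p, w, modules_of, tau=0.15):
    """Exit probability of each module from the state visit rates."""
    n = len(modules_of)
    q_exit = {}
    for (a, i), rate in p.items():
        outside = [b for b in modules_of if i not in modules_of[b]]
        link_out = sum(w[a].get(b, 0.0) for b in outside)
        leave = (1.0 - tau) * link_out + tau * len(outside) / n
        q_exit[i] = q_exit.get(i, 0.0) + rate * leave
    return q_exit


# Two triangles (0, 1, 2) and (2, 3, 4) that share node 2.
w = {
    0: {1: 0.5, 2: 0.5},
    1: {0: 0.5, 2: 0.5},
    2: {0: 0.25, 1: 0.25, 3: 0.25, 4: 0.25},
    3: {2: 0.5, 4: 0.5},
    4: {2: 0.5, 3: 0.5},
}
modules_of = {0: {"A"}, 1: {"A"}, 2: {"A", "B"}, 3: {"B"}, 4: {"B"}}
rates = overlapping_visit_rates(w, modules_of, tau=0.0)
q_exit = exit_rates(rates, w, modules_of, tau=0.0)
print(rates[(2, "A")], q_exit["A"])
```

In this example module A is exited at rate 1/12. With node 2 assigned only to module A, the same walker would exit A at rate 1/6, so the overlap halves the rate of module switching.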



2. The expanded map equation for overlapping modules 

To make explicit which terms must be updated in a
given step of a search algorithm, here we expand the
entropies of the map equation (Eq. 1) in terms of the visit
and transition rates $p_{\alpha_i}$, $q_{i\curvearrowleft}$, and $q_{i\curvearrowright}$. When
teleportation is included in the description length as above, the
outflow of modules balances the inflow, but here we
derive for the general case when $q_{i\curvearrowleft} \neq q_{i\curvearrowright}$.
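
As a numerical companion to this expansion, the sketch below evaluates the description length in two ways: from the codebook entropies weighted by their rates of use, and from sums of terms of the form $x \log_2 x$ over the visit and transition rates. The two values should agree. The function names and the input rates (a five-node example with one node assigned to two modules) are assumptions for illustration.

```python
import math


def plogp(x):
    """x * log2(x), with the convention 0 * log2(0) = 0."""
    return x * math.log2(x) if x > 0 else 0.0


def entropy(weights):
    total = sum(weights)
    return -sum(plogp(x / total) for x in weights) if total > 0 else 0.0


def length_from_entropies(p, q_enter, q_exit):
    """Index and module codebook entropies weighted by their rates of use."""
    length = sum(q_enter.values()) * entropy(list(q_enter.values()))
    for i in q_exit:
        visits = [rate for (_, module), rate in p.items() if module == i]
        length += (q_exit[i] + sum(visits)) * entropy([q_exit[i]] + visits)
    return length


def length_expanded(p, q_enter, q_exit):
    """The same quantity written as sums of x * log2(x) terms."""
    length = plogp(sum(q_enter.values()))
    length -= sum(plogp(x) for x in q_enter.values())
    length -= sum(plogp(x) for x in q_exit.values())
    length -= sum(plogp(x) for x in p.values())
    for i in q_exit:
        use = q_exit[i] + sum(r for (_, module), r in p.items() if module == i)
        length += plogp(use)
    return length


# Hypothetical rates: node 2 is assigned to both modules A and B.
p = {(0, "A"): 1 / 6, (1, "A"): 1 / 6, (2, "A"): 1 / 6,
     (2, "B"): 1 / 6, (3, "B"): 1 / 6, (4, "B"): 1 / 6}
q_enter = {"A": 1 / 12, "B": 1 / 12}
q_exit = {"A": 1 / 12, "B": 1 / 12}
from_entropies = length_from_entropies(p, q_enter, q_exit)
expanded = length_expanded(p, q_enter, q_exit)
print(from_entropies, expanded)
```

The expanded form is convenient in a search algorithm because moving or adding a single assignment changes only a few of the summed terms.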

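We use the per-step probabilities of entering the modules
$q_{i\curvearrowleft}$ to calculate the average code word length of the
index code words weighted by their rates of use, which is
given by the entropy for the index codebook

$$H(\mathcal{Q}) = -\sum_{i=1}^{m} \frac{q_{i\curvearrowleft}}{\sum_{j=1}^{m} q_{j\curvearrowleft}} \log_2 \frac{q_{i\curvearrowleft}}{\sum_{j=1}^{m} q_{j\curvearrowleft}}, \tag{6}$$

where the sum runs over the $m$ modules of the modular
partition. The contribution to the average description
length from the index codebook is the entropy $H(\mathcal{Q})$
weighted by its rate of use $q_{\curvearrowleft}$,

$$q_{\curvearrowleft} = \sum_{i=1}^{m} q_{i\curvearrowleft}. \tag{7}$$

Substituting Eq. 7 into Eq. 6, we can express the
contribution to the per-step average description length from
the index codebook as

$$q_{\curvearrowleft} H(\mathcal{Q}) = -q_{\curvearrowleft} \sum_{i=1}^{m} \frac{q_{i\curvearrowleft}}{q_{\curvearrowleft}} \log_2 \frac{q_{i\curvearrowleft}}{q_{\curvearrowleft}} = q_{\curvearrowleft} \log_2 q_{\curvearrowleft} - \sum_{i=1}^{m} q_{i\curvearrowleft} \log_2 q_{i\curvearrowleft}. \tag{8}$$

We use the per-step probabilities of exiting the modules
$q_{i\curvearrowright}$ and the visit rates $p_{\alpha_i}$ to calculate the entropy
of each module codebook:

$$H(\mathcal{P}^{i}) = -\frac{q_{i\curvearrowright}}{q_{i\curvearrowright} + \sum_{\beta \in i} p_{\beta_i}} \log_2 \frac{q_{i\curvearrowright}}{q_{i\curvearrowright} + \sum_{\beta \in i} p_{\beta_i}} - \sum_{\alpha \in i} \frac{p_{\alpha_i}}{q_{i\curvearrowright} + \sum_{\beta \in i} p_{\beta_i}} \log_2 \frac{p_{\alpha_i}}{q_{i\curvearrowright} + \sum_{\beta \in i} p_{\beta_i}}$$
$$\phantom{H(\mathcal{P}^{i})} = -\frac{1}{p^{i}_{\circlearrowright}} \left[ q_{i\curvearrowright} \log_2 q_{i\curvearrowright} + \sum_{\alpha \in i} p_{\alpha_i} \log_2 p_{\alpha_i} - p^{i}_{\circlearrowright} \log_2 p^{i}_{\circlearrowright} \right], \tag{9}$$

with $p^{i}_{\circlearrowright}$ for the rate of use of the $i$-th module codebook,

$$p^{i}_{\circlearrowright} = q_{i\curvearrowright} + \sum_{\beta \in i} p_{\beta_i}. \tag{10}$$

Finally, summing over all module codebooks, the
description length given by the overlapping module
partition M is

$$L(M) = q_{\curvearrowleft} \log_2 q_{\curvearrowleft} - \sum_{i=1}^{m} q_{i\curvearrowleft} \log_2 q_{i\curvearrowleft} - \sum_{i=1}^{m} q_{i\curvearrowright} \log_2 q_{i\curvearrowright} - \sum_{i=1}^{m} \sum_{\alpha \in i} p_{\alpha_i} \log_2 p_{\alpha_i} + \sum_{i=1}^{m} p^{i}_{\circlearrowright} \log_2 p^{i}_{\circlearrowright}. \tag{11}$$

The only visible difference between this expression and
the map equation for non-overlapping modules is the sum
over conditional probabilities for nodes assigned to
multiple modules, which is no longer independent of the
overlapping module partition M. But since the transition
rates depend on the conditional probabilities (see Eq. 5),
all terms depend on the overlapping configuration.



B. The greedy search algorithm for overlapping modules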



To detect the overlapping modular organization of a 
network, ultimately we want to find the global minimum 
of the map equation over all possible overlapping modular 
configurations of the network, but only with an exhaus- 
tive enumeration of all possible solutions can we guaran- 
tee the optimal solution. This procedure is, of course, 
impractical for all but the smallest networks. However, 
we can construct an algorithm that finds a good approximation.
Figure 5 explains the concept of our algorithm,
which builds on an iterative two-step procedure. 

In the first step, we individually assess which nodes are most likely
to be assigned to multiple modules. Starting from a hard partition
generated by Infomap [18] in the first iteration, we go through all
nodes at the boundary between modules and assign each boundary node to
adjacent modules. That is, one node and one adjacent module at a time,
we assign the node to the extra module, measure the map equation
change, and then return to the previous configuration (see Fig. 5(c)).
Because the multiply assigned nodes only connect to singly assigned
nodes in the first iteration, the conditional probabilities and the
change in the map equation can be updated quickly without a full
recalculation of the visit rates. This first step produces 3-tuples of
local changes of the form (node, extra-module, map-equation-change).
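
The first step can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes an undirected adjacency dictionary and a caller-supplied `code_length` function that evaluates the map equation for a given assignment. Both that callback and the function names are hypothetical, and the sketch recomputes the code length from scratch instead of using the fast local update described above.

```python
def boundary_candidates(adjacency, modules_of):
    """List (node, extra_module) pairs for nodes at module boundaries.

    adjacency  : dict node -> iterable of neighboring nodes (undirected)
    modules_of : dict node -> set of modules the node is assigned to
    """
    candidates = set()
    for node, neighbors in adjacency.items():
        for nb in neighbors:
            for module in modules_of[nb]:
                # A neighbor lives in a module that the node is not yet in.
                if module not in modules_of[node]:
                    candidates.add((node, module))
    return sorted(candidates)


def local_changes(adjacency, modules_of, code_length):
    """Produce (node, extra_module, code_length_change) tuples.

    code_length is a hypothetical callback that maps an assignment
    (dict node -> set of modules) to a description length.
    """
    base = code_length(modules_of)
    changes = []
    for node, extra in boundary_candidates(adjacency, modules_of):
        # Work on a copy, so the previous configuration is kept intact.
        trial = {v: set(ms) for v, ms in modules_of.items()}
        trial[node].add(extra)
        changes.append((node, extra, code_length(trial) - base))
    return changes
```

For a path of four nodes split into two modules of two nodes each, only the two middle nodes are boundary nodes, so the first step yields exactly two tuples.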

In the second step, we combine a fraction of all local 
changes generated in the first step into a global solution. 
Every time two or more multiply assigned nodes are con- 
nected, we need to solve a linear system to calculate the 
conditional probabilities. When a majority of nodes are 
assigned to multiple modules, this can take as long as cal- 
culating the steady-state distribution of random walkers 
in the first place. For good performance, we therefore try 
to test as few combinations of local changes as possible. 
After testing several different approaches, we have opted 
for a heuristic method in which we first sort the tuples 
from best to worst in terms of map equation change and 
then determine the number of best tuples that minimizes 
the map equation. The method works well, because good 
local changes often are good globally. 
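
A minimal sketch of the second step follows. It assumes a hypothetical callback `code_length_with` that returns the description length after applying a list of tuples together. For clarity the sketch scans every prefix length, which is affordable only for small inputs. It does not reproduce the quadratic fitting that the text uses to keep the number of evaluations low.

```python
def best_prefix(changes, code_length_with):
    """Rank local changes and keep the best-scoring prefix.

    changes          : list of (node, extra_module, code_length_change)
    code_length_with : hypothetical callback, list of tuples -> code length
                       when all of them are applied together
    """
    # Most negative change (largest compression gain) first.
    ranked = sorted(changes, key=lambda t: t[2])
    best_k = 0
    best_length = code_length_with([])
    for k in range(1, len(ranked) + 1):
        length = code_length_with(ranked[:k])
        if length < best_length:
            best_k, best_length = k, length
    return ranked[:best_k], best_length
```

Because applied changes interact, the length of the best prefix is generally not the number of tuples with a negative individual change, which is why each prefix is evaluated as a whole.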

As a side remark, the map equation for link communities [17] allows
for straightforward and fast calculation of all conditional
probabilities and transition rates, since each link belongs to only one
module. But this constraint enforces module switches between boundary
nodes that belong to the same module, because all boundary nodes belong
to multiple modules in the link community approach.



(a)

    Input: Network and partition generated by Infomap
    Result: Partition with overlaps

    repeat
        Assign, one at a time, each boundary node to each of its
        neighboring modules, and calculate the code length change.
        Sort these assignments in a list according to the
        compression change, from best to worst.
        // Next loop answers the question: how many changes, in
        // the ranked order, is it optimal to take?
        do
            Aggregate the k first changes and use quadratic
            optimization to find the x that minimizes the
            description length,
        until no better minimum
    until there is no more compression gain.

(b) Infomap; iteration 1; iteration 2

(d) Horizontal axis: number of applied tuples

FIG. 5. General scheme of the two-step greedy search algorithm for overlapping modules. (a) Pseudocode with first step (c)
and second step (d) of the algorithm that can be iterated as shown in (b). Starting from a hard partition generated by Infomap
[18], each iteration successively increases the overlap between modules to minimize the map equation for overlapping modules.
In the first step (c), one by one, each boundary node is assigned to adjacent modules. In the second step (d), we first sort
the local changes from best to worst and then iteratively apply quadratic fitting to find the number of best local changes that
minimizes the map equation.


Figure 5(d) shows the value of the map equation as a
function of the number of aggregated tuples ordered from 
best to worst. Combinations of tuples that individually 
generate longer description lengths can generate a shorter 
description length if they are applied together. This fact, 
together with the greedy order in which we aggregate the 
tuples, generates noise in the curve. To quickly approach 
the global minimum, we must overcome bad local minima 
caused by the noise and evaluate as few aggregations as 
possible. Therefore, we iteratively fit a quadratic polyno- 
mial to the curve by selecting new points at the minimum 
of the polynomial. A quadratic polynomial only requires 
three points to be fully specified, but in order to deal with 
the noise, we use a moving local least squares fit. In prac- 
tice, we evaluate around ten points for each quadratic fit 
and repeat this procedure a few times to obtain a good 
solution. 
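
The fitting step can be illustrated with an ordinary least-squares fit of a quadratic polynomial to sampled points (number of applied tuples, description length). The sketch below solves the 3-by-3 normal equations directly and proposes the vertex of the fitted parabola as the next number of tuples to evaluate. The function names, the clamping to a valid range, and the fallback for a non-convex fit are assumptions; the text does not specify the window or the iteration schedule.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c to (x, y) points.

    Needs at least three distinct x values; returns (a, b, c).
    """
    # Power sums that make up the normal equations.
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    # Augmented matrix for the unknowns (a, b, c).
    rows = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [v - factor * p
                           for v, p in zip(rows[r], rows[col])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


def next_trial_point(points, lo, hi):
    """Propose the next number of tuples to evaluate, clamped to [lo, hi]."""
    a, b, _ = fit_quadratic(points)
    if a <= 0:
        # The fit has no interior minimum: fall back to the best sample.
        return min(points, key=lambda p: p[1])[0]
    vertex = -b / (2 * a)
    return int(min(max(round(vertex), lo), hi))
```

Fitting to about ten noisy samples instead of exactly three points averages out part of the noise in the curve, at the price of a few extra evaluations.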

Step 1 and step 2 can now be repeated, each time starting from the
obtained solution with overlapping modules from the previous iteration.
Figure 5(b) illustrates that by repeating the two steps, we sometimes
can extend the overlap between modules, but this comes at a cost. After
the first iteration of the algorithm, step 1 also can involve solving a
linear system to calculate the conditional probabilities. Thus, the
first step is no longer guaranteed to be as fast as in the first
iteration. Still, for medium-sized networks, multiple iterations are
feasible. For example, for the networks presented in Table I, the first
iteration took a few seconds and multiple iterations until the point of
no further improvements took less than two minutes on a normal laptop.
We have made the code available here:
https://sites.google.com/site/alcidesve82/.



IV. CONCLUSIONS 

In this paper, we have introduced the map equation for 
overlapping modules. When we allow nodes to belong to 
multiple module codebooks and minimize the map equa- 
tion over possibly overlapping network partitions, we can 
determine which nodes belong to multiple modules and 
to what degree. Compared to hard partitions detected 
by the map equation, we have further compressed de- 
scriptions of a random walker on all tested real-world 
networks, and therefore revealed more regularities in the 
flow on the networks. We find the highest overlapping 
modular organization in sparse infrastructure networks, 
but this result depends on our random-walk model of 
flow. Since the mathematical framework is not limited
to random flow, it would be interesting to compare our 
results with results derived from empirical flow. 